Near-Optimal Computation of Runs over General Alphabet via Non-Crossing LCE Queries
Authors
Abstract
Longest common extension queries (LCE queries) and runs are ubiquitous in algorithmic stringology. Linear-time algorithms computing runs and preprocessing for constant-time LCE queries have been known for over a decade. However, these algorithms assume a linearly-sortable integer alphabet. A recent breakthrough paper by Bannai et al. (SODA 2015) showed a link between the two notions: all the runs in a string can be computed via a linear number of LCE queries. The first to consider these problems over a general ordered alphabet was Kosolobov (Inf. Process. Lett., 2016), who presented an O(n (log n)^{2/3})-time algorithm for answering O(n) LCE queries. This result was improved by Gawrychowski et al. (accepted to CPM 2016) to O(n log log n) time. In this work we note a special non-crossing property of LCE queries asked in the runs computation. We show that any n such non-crossing queries can be answered on-line in O(nα(n)) time, which yields an O(nα(n))-time algorithm for computing runs.
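To make the notion of a run concrete, here is a minimal brute-force sketch in Python. It is not the algorithm from the paper and does not use LCE queries or the non-crossing property; it only illustrates the definition, assuming a run is a maximal substring whose smallest period p satisfies length >= 2p. The function name naive_runs is hypothetical. Like algorithms for a general alphabet, it touches symbols only through comparisons, but it takes quadratic time.

```python
def naive_runs(s):
    """Return all runs of s as sorted (start, end, period) triples.

    Brute force, quadratic time. Indices are 0-based and inclusive.
    Illustrative only: not the O(n alpha(n)) algorithm from the abstract.
    """
    n = len(s)
    found = {}  # (start, end) -> smallest period seen for that interval
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if s[k] != s[k + p]:
                k += 1
                continue
            a = k
            # Extend the maximal stretch of positions k with s[k] == s[k + p].
            while k + p < n and s[k] == s[k + p]:
                k += 1
            # s[a .. k + p - 1] has period p and length (k - a) + p.
            if k - a >= p:
                # Periods are tried in increasing order, so the first period
                # recorded for an interval is its smallest one.
                found.setdefault((a, k + p - 1), p)
    return sorted((a, b, p) for (a, b), p in found.items())


if __name__ == "__main__":
    for start, end, period in naive_runs("mississippi"):
        print(start, end, period)
```

On "mississippi" this reports the run "ississi" with period 3 and the three runs "ss", "ss", "pp" with period 1.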
Related resources
Faster Longest Common Extension Queries in Strings over General Alphabets
Longest common extension queries (often called longest common prefix queries) constitute a fundamental building block in multiple string algorithms, for example computing runs and approximate pattern matching. We show that a sequence of q LCE queries for a string of size n over a general ordered alphabet can be realized in O(q log log n + n log* n) time making only O(q + n) symbol comparisons. C...
Small-space encoding LCE data structure with constant-time queries
The longest common extension (LCE) problem is to preprocess a given string w of length n so that the length of the longest common prefix between suffixes of w that start at any two given positions is answered quickly. In this paper, we present a data structure of O(zτ + n/τ) words of space which answers LCE queries in O(1) time and can be built in O(n log σ) time, where 1 ≤ τ ≤ √n is a parame...
Optimal Success Bounds for Single Query Quantum Algorithms Computing the General SUM Problem
In this thesis the problem of computing the sum of a string of arbitrary finite length drawn from an arbitrary finite alphabet is treated. The resource considered is the number of queries to some oracle which hides the string and gives access to each of its digits. The sum of a string is defined as adding all of the string’s digits together modulo the alphabet size. Classically, this problem is...
On Runs in Independent Sequences
Given an i.i.d. sequence of n letters from a finite alphabet, we consider the length of the longest run of any letter. In the equiprobable case, results for this run turn out to be closely related to the well-known results for the longest run of a given letter. For coin-tossing, tail probabilities are compared for both kinds of runs via Poisson approximation.
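The statistic studied here, the length of the longest run of any letter, can be computed with a short Python sketch. Note that "run" in this abstract means a block of equal consecutive letters, which is narrower than the periodic runs of stringology. The function name longest_run is hypothetical.

```python
from itertools import groupby


def longest_run(seq):
    """Return (letter, length) for the longest block of equal consecutive letters.

    Ties are resolved in favor of the earliest block; an empty input
    yields (None, 0).
    """
    best_letter, best_len = None, 0
    for letter, group in groupby(seq):
        length = sum(1 for _ in group)
        if length > best_len:
            best_letter, best_len = letter, length
    return best_letter, best_len


if __name__ == "__main__":
    print(longest_run("HTTTHHT"))
```

For the coin-tossing sequence "HTTTHHT" the longest run is three tails.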
Longest Common Extensions via Fingerprinting
The longest common extension (LCE) problem is to preprocess a string in order to allow for a large number of LCE queries, such that the queries are efficient. The LCE value, LCE_s(i, j), is the length of the longest common prefix of the pair of suffixes starting at indices i and j in the string s. The LCE problem can be solved in linear space with constant query time and a preprocessing of sorting...
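The definition of LCE_s(i, j) can be read directly as code. The sketch below is the naive method with no preprocessing, answering each query by direct symbol comparison in time proportional to the answer; it is not the fingerprinting technique of the paper. The function name lce and the 0-based indexing are assumptions made for illustration.

```python
def lce(s, i, j):
    """Length of the longest common prefix of the suffixes s[i:] and s[j:].

    Naive scan using only equality comparisons of symbols, so it works
    over any alphabet. Indices are 0-based.
    """
    n = len(s)
    k = 0
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k


if __name__ == "__main__":
    # Suffixes "abracadabra" and "abra" share the prefix "abra".
    print(lce("abracadabra", 0, 7))
```

Preprocessed data structures such as those surveyed above exist to avoid this linear worst case per query.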
Publication date: 2016